Omics Data Analysis

I have applied a range of statistical methods and developed software platforms and analytical tools to process and analyze large-scale sequencing datasets. These tools facilitate the discovery of patterns and similarities across diverse omics datasets, enabling the construction of statistical models that support robust hypothesis generation.

In various projects, I have integrated data from multiple sources, including:

  • Gene expression data (RNA-seq, scRNA-seq)
  • DNA methylation profiles (Bisulfite-seq, RRBS, methylation arrays)
  • Open chromatin regions (ATAC-seq)
  • Transcription factor binding sites (ChIP-seq)
  • Data from specialized protocols for detecting DNA-RNA hybrids, such as DRIP-seq and RDIP-seq
  • Therapies, drugs, and biomarkers from databases of internal and external clinical trials

Statistical Analysis

Methods I have applied include:

  • Survival analysis: Kaplan-Meier estimator and Cox proportional hazards model

  • Regression analysis: including linear regression

  • Classification methods: logistic regression, elastic net, random forests, support vector machines (SVMs), and positive-unlabelled (PU) learning

  • Unsupervised methods: principal component analysis (PCA), Multi-Omics Factor Analysis (MOFA), and autoencoders
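
As an illustration of the survival analysis entry above, the Kaplan-Meier estimator can be sketched in a few lines of Python. This is a minimal sketch for orientation only, not the code used in the projects described: the function name `kaplan_meier` and the toy data are hypothetical, and real analyses would typically rely on an established survival analysis package.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time per subject
    events:    1 if the event was observed, 0 if the subject was censored
    """
    # Number of observed events at each time point
    deaths = Counter(t for t, e in zip(durations, events) if e)
    # Number of subjects leaving the risk set at each time (event or censoring)
    exits = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: S(t) = S(t-) * (1 - d / n_at_risk)
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Subjects with an event or censored at t are no longer at risk afterwards
        at_risk -= exits[t]
    return curve


# Toy, made-up data for illustration only (not from any real study)
toy_durations = [2, 3, 3, 5, 8, 8, 9]
toy_events = [1, 1, 0, 1, 0, 1, 0]
km_curve = kaplan_meier(toy_durations, toy_events)
```

Censored subjects contribute to the risk set up to their last follow-up time but never trigger a drop in the curve, which is what distinguishes this estimator from a simple empirical survival fraction.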